************************************************
Replication files for "A Database of the United States Supreme Court's Shadow Docket, 1993-2025," Journal of Law and Courts
Jonathan P. Kastellec, jkastell@princeton.edu
Anthony R. Taboni, anthony.taboni@austin.utexas.edu
************************************************

***************************************
***Instructions:

To replicate the tables and figures in the paper:
1) Open main.R and set the working directory to the source file location
2) Run main.R
3) This will produce the figures and tables in the paper

To reconstruct shadow_docket_database_v2-0 (outside of scraping lower court dockets):
1) Open the terminal and set the directory to the scripts folder
2) Run 01_Journals_to_Docket_Day.py and then 02_Separate_Actions.py, which will process the journals
3) Open R and set the directory to the replication folder 
4) Run 06_Extract_Details.R and 07_Merge_Dockets.R

To rescrape lower court docket information:

First run steps 1 and 2 of the instructions above. Then, before running steps 3 and 4, proceed as follows:

*****NOTE: This will take several days to complete****

1) Open the terminal and set the directory to the scripts folder
2) Run 03_Scrape_Lower_Court_Dockets_SCOTUS.py
3) Run 04_Scrape_Dockets_National_Archives.py
4) Run 05_Process_Lower_Dockets.py
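
The order of the steps above can be sketched as a small Python driver. This is an illustrative sketch only and is not part of the replication package: it assumes that `python` and `Rscript` are on the PATH and that the numbered scripts sit in a `Scripts` folder directly under the replication folder. The function names `build_pipeline` and `run_pipeline` are hypothetical.

```python
import subprocess
from pathlib import Path

# Script names are taken from the `Scripts` folder listing in this README.
PROCESS_JOURNALS = ["01_Journals_to_Docket_Day.py", "02_Separate_Actions.py"]
SCRAPE_DOCKETS = [
    "03_Scrape_Lower_Court_Dockets_SCOTUS.py",
    "04_Scrape_Dockets_National_Archives.py",
    "05_Process_Lower_Dockets.py",
]
BUILD_DATABASE = ["06_Extract_Details.R", "07_Merge_Dockets.R"]


def build_pipeline(rescrape=False):
    """Return the script names in the order they should run."""
    steps = list(PROCESS_JOURNALS)
    if rescrape:
        # The scraping steps go between journal processing and the R steps.
        steps += SCRAPE_DOCKETS
    return steps + BUILD_DATABASE


def run_pipeline(root, rescrape=False):
    """Run each step in order, stopping at the first failure (check=True)."""
    root = Path(root)
    for name in build_pipeline(rescrape):
        if name.endswith(".py"):
            # Python steps are run from inside the scripts folder.
            subprocess.run(["python", name], cwd=root / "Scripts", check=True)
        else:
            # R steps are run from the replication folder (assumed layout).
            subprocess.run(
                ["Rscript", str(Path("Scripts") / name)], cwd=root, check=True
            )
```

Under these assumptions, `run_pipeline(".")` would rebuild the database from the journals only, while `run_pipeline(".", rescrape=True)` would also run the scraping steps, which take several days.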

If you only wish to scrape new dockets between official updates of the data, collect_new_dockets.py will collect any dockets filed after the most recent Supreme Court term that are not yet included in the database. These dockets will still need to be processed.

***************************************






***Table of Contents

* main.R --- Master file which runs scripts needed to produce figures and tables in the main paper 


* `Documentation` folder
	** Coding_Protocols.pdf - Coding protocols for data
	** shadow_docket_database_codebook_v2-0.pdf - Codebook with a description for each variable included in `shadow_docket_database_v2-0`

* `Data` folder:
	** shadow_docket_database_v2_0.csv - Shadow docket database. Also available in .xlsx and .dta formats
	** justices.xlsx - Martin-Quinn scores as calculated by Martin and Quinn (2002): https://mqscores.wustl.edu/measures.php
	** disagreements.xlsx - Hand-coded votes for non-unanimous cases
	** `SC_Journals` folder:
		*** SCJ_XX.pdf - Journal of the Supreme Court for the XX term. Available at https://www.supremecourt.gov/orders/journal.aspx. Includes the 1993-2024 terms
	** `Processed_Data` folder: 
		*** Categorized_Docket_Actions.xlsx: Information about the category for each docket-day-action
		*** Coding_Errors_Action_Class_Fixed: Hand-coded action class for docket-day-actions missed by 06_Extract_Details.R
		*** Coding_Errors_Action_Class.xlsx: List of dockets where action-class is not automatically coded
		*** Coding_Errors_Relief_Fixed.xlsx: Hand-coded relief for docket-day-actions missed by 06_Extract_Details.R
		*** Coding_Errors_Relief.xlsx: List of dockets where relief is not automatically coded
		*** Docket_Day_Action.xlsx: Uncategorized docket-day-actions. Created by 02_Separate_Actions.py
		*** Docket_Day.xlsx: Docket-Days. Created by 01_Journals_to_Docket_Day.py
		*** Docket_Information_1997_2024.xlsx: Information about lower courts and attorneys. Created by 05_Process_Lower_Dockets.py
		*** SC_Bar_Admissions.xlsx: Admissions to the Supreme Court Bar. Created by 01_Journals_to_Docket_Day.py
		*** `Lower_Court_Docket` folder:
			**** archives_dockets.csv: Unique National Archives identifiers for dockets. Created by 04_Scrape_Dockets_National_Archives.py
			**** Supreme_Court_Dockets_Raw_Text_1997_2001.json: Information about lower courts and attorneys from docket-book (1997-2001). Created by 04_Scrape_Dockets_National_Archives.py
			**** Supreme_Court_Dockets_Raw_Text_2001_2016.json: Information about lower courts and attorneys from docket-book (2001-2016). Created by 03_Scrape_Lower_Court_Dockets_SCOTUS.py
			**** Supreme_Court_Dockets_Raw_Text_2016_2024.json: Information about lower courts and attorneys from docket-book (2016-2024). Created by 03_Scrape_Lower_Court_Dockets_SCOTUS.py
			**** Supreme_Court_Dockets_Raw_Text_Missing.json: Information about lower courts and attorneys from docket-book missed by previous scrapes. Created by 05_Process_Lower_Dockets.py

* `Scripts` folder
	** 01_Journals_to_Docket_Day.py - Convert Journal of the Supreme Court to docket-day dataset. Generates Docket_Day.xlsx
	** 02_Separate_Actions.py - Separates docket-days into individual actions. Generates Docket_Day_Action.xlsx 
	** 03_Scrape_Lower_Court_Dockets_SCOTUS.py - Scrapes lower court information for dockets in the 2001-2024 terms
	** 04_Scrape_Dockets_National_Archives.py - Scrapes lower court information for dockets in the 1997-2001 terms
	** 05_Process_Lower_Dockets.py - Processes lower court docket information and collects missing dockets. Generates Docket_Information_1997_2024.xlsx
	** 06_Extract_Details.R - Categorizes docket-day-actions. Generates Categorized_Docket_Actions.xlsx
	** 07_Merge_Dockets.R - Merges dockets with lower court information. Generates shadow_docket_database_v2-0
	** collect_new_dockets.py - Collects lower court dockets for new dockets filed between official database updates
	
* `Plots` folder
	** Figure_1.pdf - Example of Supreme Court order
	** Figure_2.pdf - Example of multiple orders being assigned the same text
	** Figure_3.pdf - Shadow docket actions over time: Produced by main.R
	** Figure_4.pdf - Certiorari over time: Produced by main.R
	** Figure_5.pdf - Grant rates over time: Produced by main.R
	** Figure_6.pdf - Emergency applications over time: Produced by main.R
	** Figure_7.pdf - Disagreement and dissent over time: Produced by main.R
	** Figure_8.pdf - Comparison in ideology estimates between MQ and shadow docket: Produced by main.R
	** Figure_9.pdf - Summary reversal over time: Produced by main.R

	
* `Tables` folder
	** table1.tex - Table 1, results from main analysis: Produced by main.R
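
Before running the R steps, it may help to confirm that the intermediate files listed under `Processed_Data` are present. The following is a minimal sketch, not part of the replication package. It assumes the folder layout shown in the table of contents (`Data/Processed_Data/...`), and which files each R script actually reads is an assumption. The name `missing_inputs` is hypothetical.

```python
from pathlib import Path

# Files that this README lists as created by earlier pipeline steps.
EXPECTED_INPUTS = [
    "Docket_Day.xlsx",
    "Docket_Day_Action.xlsx",
    "Docket_Information_1997_2024.xlsx",
]


def missing_inputs(root, names=EXPECTED_INPUTS):
    """Return the expected files that are not present under root."""
    # Assumed layout: <root>/Data/Processed_Data/<file>
    folder = Path(root) / "Data" / "Processed_Data"
    return [name for name in names if not (folder / name).is_file()]
```

Calling `missing_inputs(".")` from the replication folder returns an empty list when all three files exist.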
*******************************************************************
***Software
All figures and analyses were created using macOS Tahoe Version 26.1. 

Some scraping of lower court dockets was done under previous macOS Sequoia versions. The replication code will run on macOS Tahoe Version 26.1.

********
R - 4.5.1

**R packages
readxl_1.4.5          
lubridate_1.9.4   
forcats_1.0.0  
stringr_1.5.2
dplyr_1.1.4   
purrr_1.0.0
readr_2.1.5
tidyr_1.3.1 
tibble_3.3.0 
ggplot2_4.0.0         
tidyverse_2.0.0 
writexl_1.5.4
haven_2.5.5
MCMCpack_1.7-1  
MASS_7.3-65     
coda_0.19-4.1   
xtable_1.8-4    
gridExtra_2.3             
********

********
Python - 3.13.5

**Python packages
PyPDF2 3.0.1
pandas 2.3.3
pdfplumber 0.11.7         
selenium 4.35.0
bs4 0.0.2
********
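
The Python package versions listed above can be compared against the local environment with a short check. This is a hedged sketch, not part of the replication package: it assumes the names in the list are the pip distribution names, and the helper names `installed_version` and `version_mismatches` are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this README; keys are assumed to be the pip
# distribution names.
PINNED = {
    "PyPDF2": "3.0.1",
    "pandas": "2.3.3",
    "pdfplumber": "0.11.7",
    "selenium": "4.35.0",
    "bs4": "0.0.2",
}


def installed_version(name):
    """Return the installed version string, or None if not installed."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def version_mismatches(pinned=PINNED, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

An empty result from `version_mismatches()` means the environment matches the listed versions. A different version does not necessarily mean the scripts will fail.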


****************************************************

*** Additional notes

* main.R should run in less than 10 minutes
* Reconstructing the dataset (01_Journals_to_Docket_Day.py, 02_Separate_Actions.py, 06_Extract_Details.R and 07_Merge_Dockets.R) should take under 30 minutes
* Rescraping the lower court dockets will take several days

****************************************************


*** References
*Martin, Andrew D., and Kevin M. Quinn. "Dynamic ideal point estimation via Markov chain Monte Carlo for the US Supreme Court, 1953–1999." Political Analysis 10, no. 2 (2002): 134-153.





